New York is the city that never sleeps, and it is often described as the busiest city in the world. Taxis are the lifeblood of the city: there are over 13,000 active yellow cabs in NYC, and they make millions of trips every month. We decided to analyze this data. The following analysis covers one month of yellow taxi trips in New York.

The data came as a Parquet file, so we read it into a dataframe and converted it to CSV format.
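A minimal sketch of that conversion with pandas is shown here. The function name and file paths are hypothetical, and `pd.read_parquet` assumes a Parquet engine such as pyarrow is installed; the original notebook may have done this differently.

```python
import pandas as pd

def parquet_to_csv(parquet_path, csv_path):
    """Load a Parquet file into a dataframe and save a CSV copy of it."""
    df = pd.read_parquet(parquet_path)   # needs pyarrow or fastparquet
    df.to_csv(csv_path, index=False)     # drop the row index from the output
    return df

# Hypothetical usage; replace the paths with the real file locations:
# trips = parquet_to_csv("yellow_tripdata.parquet", "yellow_tripdata.csv")
```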

Out of 3.1 million rows, only around 93,000 (about 3%) seem to have null values, which is not a very significant share.
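One way to arrive at a count like this is to flag every row that has at least one missing value. The sketch below uses a tiny made-up sample rather than the real trip data, and the helper name is my own.

```python
import pandas as pd

def null_row_summary(df):
    """Return (rows with at least one null, their share of all rows)."""
    rows_with_null = int(df.isna().any(axis=1).sum())
    share = rows_with_null / len(df) if len(df) else 0.0
    return rows_with_null, share

# Made-up sample for illustration only.
sample = pd.DataFrame({
    "passenger_count": [1, None, 2, 1],
    "fare_amount": [10.0, 7.5, None, 12.0],
})
count, share = null_row_summary(sample)
print(f"{count} of {len(sample)} rows contain nulls ({share:.1%})")
```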

Next, I created two variables, dates and locations, holding lists of all the unique pickup dates and pickup locations, respectively.
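A sketch of how these two lists could be built follows. The column names `tpep_pickup_datetime` and `PULocationID` are assumed from the usual yellow taxi schema, and the rows are made up.

```python
import pandas as pd

# Made-up rows; column names are an assumption about the real file.
trips = pd.DataFrame({
    "tpep_pickup_datetime": pd.to_datetime([
        "2023-01-01 00:15:00", "2023-01-01 13:40:00", "2023-01-02 08:05:00",
    ]),
    "PULocationID": [132, 161, 132],
})

# Unique calendar dates (time of day dropped) and unique pickup zones, sorted.
dates = sorted(trips["tpep_pickup_datetime"].dt.date.unique())
locations = sorted(trips["PULocationID"].unique())
```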

Here we created an interactive function that lets the user pick a date and a location and view the total amount of money earned from pickups in the chosen area on the chosen date. It also breaks the total cost down into its components.
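The non-interactive core of such a function might look like the sketch below. The function name, the column names, and the list of components are assumptions, and the sample rows are made up.

```python
import pandas as pd

# Assumed component columns; adjust to match the real file.
COMPONENTS = ["fare_amount", "extra", "mta_tax", "tip_amount", "tolls_amount"]

def revenue_breakdown(df, date, location):
    """Sum total_amount and each component for one pickup date and zone."""
    on_date = df["tpep_pickup_datetime"].dt.normalize() == pd.Timestamp(date).normalize()
    at_zone = df["PULocationID"] == location
    selected = df.loc[on_date & at_zone, ["total_amount"] + COMPONENTS]
    return selected.sum()

# Made-up sample rows for illustration.
trips = pd.DataFrame({
    "tpep_pickup_datetime": pd.to_datetime(
        ["2023-01-01 00:15", "2023-01-01 13:40", "2023-01-02 08:05"]),
    "PULocationID": [132, 132, 132],
    "fare_amount": [10.0, 20.0, 5.0],
    "extra": [1.0, 0.0, 0.0],
    "mta_tax": [0.5, 0.5, 0.5],
    "tip_amount": [2.0, 4.0, 0.0],
    "tolls_amount": [0.0, 6.5, 0.0],
    "total_amount": [13.5, 31.0, 5.5],
})
result = revenue_breakdown(trips, "2023-01-01", 132)
print(result)
```

In a notebook, a function like this could then be wired to dropdowns with a widget library such as ipywidgets.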

From the above visualization, we can conclude that Tuesdays and Wednesdays are the busiest days of the week. Taxis are used the least on Sundays.

As a taxi driver, I would know that I am likely to get more customers on Tuesdays and Wednesdays.
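The weekday comparison behind a plot like this can be sketched as a simple count of pickups per day name. The column name is assumed and the sample rows are made up.

```python
import pandas as pd

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

def trips_per_weekday(df, time_col="tpep_pickup_datetime"):
    """Count trips per weekday, ordered Monday to Sunday."""
    counts = df[time_col].dt.day_name().value_counts()
    return counts.reindex(WEEKDAYS, fill_value=0)

# Made-up pickups: one on a Sunday, two on a Tuesday, one on a Wednesday.
trips = pd.DataFrame({"tpep_pickup_datetime": pd.to_datetime([
    "2023-01-01 10:00", "2023-01-03 09:00",
    "2023-01-03 18:00", "2023-01-04 12:00",
])})
by_day = trips_per_weekday(trips)
print(by_day)
```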

From this visualization, we can deduce that afternoon and evening hours see more customers. The busiest hours of the day are between 10 am and 9 pm.
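A comparable hourly view can be sketched by counting pickups per hour of the day. As before, the column name is an assumption and the rows are made up.

```python
import pandas as pd

def trips_per_hour(df, time_col="tpep_pickup_datetime"):
    """Count trips for each pickup hour, 0 through 23."""
    counts = df[time_col].dt.hour.value_counts()
    return counts.reindex(range(24), fill_value=0)

# Made-up pickups at 10:00, 13:00 (twice), and 23:00.
trips = pd.DataFrame({"tpep_pickup_datetime": pd.to_datetime([
    "2023-01-03 10:05", "2023-01-03 13:10",
    "2023-01-03 13:50", "2023-01-03 23:30",
])})
by_hour = trips_per_hour(trips)
print(by_hour[by_hour > 0])
```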

On most days, customers prefer Vendor 2, followed by Vendor 1.

On all days of the week, payment type 1 is preferred, while payment types 4 and 5 are rarely used.
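Both the vendor and the payment-type comparisons boil down to counting a category per weekday. A rough sketch with `pd.crosstab` follows; the helper name and the column names `VendorID` and `payment_type` are assumptions, and the rows are made up.

```python
import pandas as pd

def counts_by_weekday(df, category_col, time_col="tpep_pickup_datetime"):
    """Cross-tabulate trips: one row per weekday, one column per category."""
    return pd.crosstab(df[time_col].dt.day_name(), df[category_col])

# Made-up trips: one on a Sunday, two on a Tuesday.
trips = pd.DataFrame({
    "tpep_pickup_datetime": pd.to_datetime(
        ["2023-01-01 10:00", "2023-01-03 09:00", "2023-01-03 18:00"]),
    "VendorID": [2, 2, 1],
    "payment_type": [1, 1, 2],
})
vendors = counts_by_weekday(trips, "VendorID")
payments = counts_by_weekday(trips, "payment_type")
print(vendors)
```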

From this visualization, we can conclude that higher tips are given after midnight, especially at 12-1 am and 4-6 am.

From this visualization, we deduce that fares are higher after midnight. I'm guessing that taxis charge a higher rate at night.
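The hourly tip and fare comparisons can both be sketched as a mean per pickup hour. The column names `tip_amount` and `fare_amount` are assumed, and the sample values are made up, so the numbers say nothing about the real data.

```python
import pandas as pd

def hourly_means(df, time_col="tpep_pickup_datetime",
                 value_cols=("tip_amount", "fare_amount")):
    """Average each value column per pickup hour."""
    hour = df[time_col].dt.hour.rename("hour")
    return df.groupby(hour)[list(value_cols)].mean()

# Made-up trips: two shortly after midnight, one in the afternoon.
trips = pd.DataFrame({
    "tpep_pickup_datetime": pd.to_datetime(
        ["2023-01-03 00:30", "2023-01-03 00:45", "2023-01-03 14:00"]),
    "tip_amount": [4.0, 6.0, 2.0],
    "fare_amount": [20.0, 30.0, 10.0],
})
hourly = hourly_means(trips)
print(hourly)
```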

This visualization needs to be worked on further. I can't accurately tell which days receive the most tips from the customers.

Similarly, I can't accurately tell which days cover the most distance. This visualization also needs further work.

There are some extra charges on top of the fare amount, such as the MTA tax, tolls amount, and airport fee. I wanted to visualize which of these extra fees are charged more or less on a given day of the week.
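One way to prepare the data for such a plot is to total each extra charge per weekday. In this sketch the column names `mta_tax`, `tolls_amount`, and `airport_fee` are assumptions (the exact spelling may differ in the real file), and the rows are made up.

```python
import pandas as pd

# Assumed names of the extra-charge columns.
EXTRA_CHARGES = ["mta_tax", "tolls_amount", "airport_fee"]

def extra_charges_by_weekday(df, time_col="tpep_pickup_datetime"):
    """Sum each extra charge per pickup weekday."""
    day = df[time_col].dt.day_name().rename("weekday")
    return df.groupby(day)[EXTRA_CHARGES].sum()

# Made-up trips: one on a Sunday, two on a Tuesday.
trips = pd.DataFrame({
    "tpep_pickup_datetime": pd.to_datetime(
        ["2023-01-01 10:00", "2023-01-03 09:00", "2023-01-03 18:00"]),
    "mta_tax": [0.5, 0.5, 0.5],
    "tolls_amount": [0.0, 6.5, 0.0],
    "airport_fee": [1.25, 0.0, 1.25],
})
charges = extra_charges_by_weekday(trips)
print(charges)
```

The resulting table has one row per weekday and one column per charge, which maps directly onto a grouped bar chart.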